pacman::p_load(tmap, sf, sfdep, tidyverse, mapview, RColorBrewer)Take-home Exercise 1
0. Goals & Objectives.
On this page, we define Exploratory Spatial Data Analysis (ESDA) to understand spatial and spatio-temporal mobility patterns of public bus passengers in Singapore.
Due to the nature of EDA and Data Analysis, parts of this page have been Collapsed or placed behind tabs, to avoid excessive scrolling.
0.1 Motivation
Public transport is a key concern for residents in land-scarce, population-dense Singapore. With COEs reaching record highs and authorities announcing the termination of bus services, there is no better time to understand the public transport network and systems in Singapore.
By understanding and modelling patterns of public transport consumption - including specifically Local Indicators of Spatial Autocorrelation (LISA), insights can be provided to both public and private sector to formulate more informed decisions that for the benefit of the public, and not just for profit.
1. Geospatial Data Wrangling
This study was performed in R, written in R Studio, and published using Quarto.
1.1 Import Packages
This function calls pacman to load sf, tidyverse, tmap, knitr packages;
tmap: For thematic mapping; powerful mapping packagesf: for geospatial data handling, but also geoprocessing: buffer, point-in-polygon count, etc- batch processing over GIS packages; can handle tibble format
sfdep: creates space-time cube, EHSA; replaces spdeptidyverse: for non-spatial data handling; commonly used R package and containsdplyrfor dataframe manipulationmapview: interactive plotting & manipulationRColorBrewer: custom colour palettes for manipulation
1.2 Import Data
1.2.1 Geospatial Load Bus Stop Data
- First, we load
BusStopshapefile data from LTA Datamall. st_read()is used to import the ESRI Shapefile data into ansfdataframe
show code
raw_bus_stop_sf <- st_read(dsn = "data/geospatial",
layer = "BusStop") Reading layer `BusStop' from data source
`C:\1darren\ISSS624\Take-home_Ex01\data\geospatial' using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
show code
head(raw_bus_stop_sf)Simple feature collection with 6 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 13228.59 ymin: 30391.85 xmax: 41603.76 ymax: 44206.38
Projected CRS: SVY21
BUS_STOP_N BUS_ROOF_N LOC_DESC geometry
1 22069 B06 OPP CEVA LOGISTICS POINT (13576.31 32883.65)
2 32071 B23 AFT TRACK 13 POINT (13228.59 44206.38)
3 44331 B01 BLK 239 POINT (21045.1 40242.08)
4 96081 B05 GRACE INDEPENDENT CH POINT (41603.76 35413.11)
5 11561 B05 BLK 166 POINT (24568.74 30391.85)
6 66191 B03 AFT CORFE PL POINT (30951.58 38079.61)
- We check the coordinate reference system with
st_crs();
show code
st_crs(raw_bus_stop_sf)Coordinate Reference System:
User input: SVY21
wkt:
PROJCRS["SVY21",
BASEGEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ID["EPSG",6326]],
PRIMEM["Greenwich",0,
ANGLEUNIT["Degree",0.0174532925199433]]],
CONVERSION["unnamed",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]]
- We see that although the projection is SVY21, the CRS (coordinate reference system) is EPSG 9001, which is incorrect;
- We thus use
st_set_crs()to set it to 3414, which is the EPSG code for SVY21- we then use
st_crs()to confirm it is the correct EPSG code
- we then use
show code
raw_bus_stop_sf <- st_set_crs(raw_bus_stop_sf, 3414)Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
that
show code
st_crs(raw_bus_stop_sf)Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
- We now do a quick plot to quickly visualize the bus stops;
qtm()is a wrapper to do a quick, simple plot usingtmapwithout defining arguments- we use
tmap_mode("plot")to set the map as a static image, rather than as an interactive map, in order to save processing.
show code
tmap_mode("plot")tmap mode set to plotting
show code
qtm(raw_bus_stop_sf)
1.2.2 Visualizing within Singapore’s Administrative National Boundary
- This image is hard to read; we can vaguely discern Singapore’s Southern coastline, but it can be hard to visualize
- I have sourced and downloaded supplemental data about Singapore’s Administrative National Boundary (“SANB”) from igismap.com as a base layer for visualization;
- We set
tmap_mode("view")to allow us to scroll and confirm the SARB boundaries - As before, we use
st_read()to load the shapefile data, andst_transform()to ensure the projection is correct- We use
tm_shape() + tm_polygons()to map a grey layer of the SARB boundaries; - On top of which, we layer
tm_shape() + tm_dots()to show the bus stops.
- We use
- We set
show code
sanb_sf <- st_read(dsn = "data/geospatial",
layer = "singapore_administrative_national_boundary") %>%
st_transform(crs = 3414)Reading layer `singapore_administrative_national_boundary' from data source
`C:\1darren\ISSS624\Take-home_Ex01\data\geospatial' using driver `ESRI Shapefile'
Simple feature collection with 1 feature and 15 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6056 ymin: 1.158682 xmax: 104.4074 ymax: 1.471564
Geodetic CRS: WGS 84
show code
tmap_mode("view")tmap mode set to interactive viewing
show code
tm_shape(sanb_sf) +
tm_polygons() +
tm_shape(raw_bus_stop_sf) +
tm_dots()show code
## Additional data from: Data.gov.sg, https://beta.data.gov.sg/datasets/d_02cba6aeeed323b5f6c723527757c0bc/viewThat’s because the Singapore National Administrative Boundaries map includes Pedra Branca, the much-disputed outlying island and easternmost point of Singapore.
It’s an amusing artifact here, but will not be involved in further analysis later.
- We note there are a number of bus stops outside Singapore’s boundaries; Specifically, three bus stops in a cluster just outside the Causeway, and one further North.
- We perform several steps to isolate & check the data;
- we use
st_filter()to find bus stops within Singapore’s Administrative National Boundaries, and createsg_bus_stop_sffor future use - to check what bus stops have been dropped, we use
anti_join()to compareraw_bus_stop_sfwithsg_bus_stop_sf. We usest_drop_geometryon bothsfdataframes to
- we use
show code
sg_bus_stop_sf <- st_filter(raw_bus_stop_sf, sanb_sf)
anti_join(st_drop_geometry(raw_bus_stop_sf), st_drop_geometry(sg_bus_stop_sf))Joining with `by = join_by(BUS_STOP_N, BUS_ROOF_N, LOC_DESC)`
BUS_STOP_N BUS_ROOF_N LOC_DESC
1 47701 NIL JB SENTRAL
2 46239 NA LARKIN TER
3 46609 NA KOTARAYA II TER
4 46211 NIL JOHOR BAHRU CHECKPT
5 46219 NIL JOHOR BAHRU CHECKPT
- We see there are in fact 5 bus stops outside of Singapore (including the defunct Kotaraya II Terminal) that have been removed, which means our data cleaning was correct.
1.3 Geospatial Data Cleaning
1.3.1 Removing Duplicate Bus Stops
- But, do we need to do more data cleaning?
- We use
length()to find the total number of raw values in the$BUS_STOP_Ncolumn ofsg_bus_stop_sf- We then compare this to
length(unique())to find the unique values
- We then compare this to
- And, indeed, we find there are 16 bus stops that are repeated;
cat("Total number of rows in sg_bus_stop_sf\t\t: ", paste0(length(sg_bus_stop_sf$BUS_STOP_N)))Total number of rows in sg_bus_stop_sf : 5156
cat("\nTotal unique bus stop IDs in sg_bus_stop_sf\t: ", paste0(length(unique(sg_bus_stop_sf$BUS_STOP_N))))
Total unique bus stop IDs in sg_bus_stop_sf : 5140
cat("\nRepeated bus stops\t\t\t\t: ", paste0(length(raw_bus_stop_sf$BUS_STOP_N) - length(unique(raw_bus_stop_sf$BUS_STOP_N))))
Repeated bus stops : 16
- It appears there are 16 datapoints that are specifically repeated; let’s identify the bus stop numbers with repeated rows:
- First we use
filter()with a pipe mark (usingorcondition) to identify repeated bus stop numbers (i.e.$BUS_STOP_N) - We sort them in ascending order using
arrange(); then, usinggroup_by()androw_number()we add row numbers based on$BUS_STOP_Nto a new column usingmutate()
- First we use
show code
repeated_df <- sg_bus_stop_sf %>%
filter(duplicated(BUS_STOP_N) | duplicated(BUS_STOP_N, fromLast = TRUE)) %>%
arrange(BUS_STOP_N) %>%
group_by(BUS_STOP_N) %>%
mutate(RowNumber = row_number())
repeated_dfSimple feature collection with 32 features and 4 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 13488.02 ymin: 32594.17 xmax: 44144.57 ymax: 47934
Projected CRS: SVY21 / Singapore TM
# A tibble: 32 × 5
# Groups: BUS_STOP_N [16]
BUS_STOP_N BUS_ROOF_N LOC_DESC geometry RowNumber
* <chr> <chr> <chr> <POINT [m]> <int>
1 11009 B04 Ghim Moh Ter (23101.34 32594.17) 1
2 11009 B04-TMNL GHIM MOH TER (23100.58 32604.36) 2
3 22501 B02 Blk 662A (13489.09 35536.4) 1
4 22501 B02 BLK 662A (13488.02 35537.88) 2
5 43709 B06 BLK 644 (18963.42 36762.8) 1
6 43709 B06 BLK 644 (18952.02 36751.83) 2
7 47201 UNK <NA> (22616.75 47793.68) 1
8 47201 NIL W'LANDS NTH STN (22632.92 47934) 2
9 51071 B21 MACRITCHIE RESERVO… (28311.27 36036.92) 1
10 51071 B21 MACRITCHIE RESERVO… (28282.54 36033.93) 2
# ℹ 22 more rows
- From examination, there are 32 bus stops sharing 16 bus stop numbers – 16 pairs of bus stops sharing the same number.
- Let’s examine these bus stop pairs on the map;
- we use
mapview()to display these repeated bus stops on the map - we use
col="BUS_STOP_N"withpalette="Spectralto give each pair of bus stops an individual colour
- we use
show code
tmap_mode("plot")tmap mode set to plotting
show code
tm_shape(sanb_sf) +
tm_polygons() +
tm_shape(repeated_df) +
tm_dots(col = "BUS_STOP_N", palette = "Spectral")
- After confirming with Prof Kam, we will simply drop the second instance of the rows.
- we use
duplicated()to identify rows with repeated values of$BUS_STOP_N; duplicated rows will returnTRUEwhile all other rows will returnFALSE - We use
!to invert the values, so only the unduplicated rows will returnTRUE. - We then use square brackets
[]to indexsg_bus_stop_sfbased on the rows, and return only the unduplicated rows;
- we use
show code
sg_bus_stop_sf <- sg_bus_stop_sf[!duplicated(sg_bus_stop_sf$BUS_STOP_N), ]
head(sg_bus_stop_sf)Simple feature collection with 6 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 13228.59 ymin: 30391.85 xmax: 41603.76 ymax: 44206.38
Projected CRS: SVY21 / Singapore TM
BUS_STOP_N BUS_ROOF_N LOC_DESC geometry
1 22069 B06 OPP CEVA LOGISTICS POINT (13576.31 32883.65)
2 32071 B23 AFT TRACK 13 POINT (13228.59 44206.38)
3 44331 B01 BLK 239 POINT (21045.1 40242.08)
4 96081 B05 GRACE INDEPENDENT CH POINT (41603.76 35413.11)
5 11561 B05 BLK 166 POINT (24568.74 30391.85)
6 66191 B03 AFT CORFE PL POINT (30951.58 38079.61)
::: {.callout-warning collapse=“true”} ## !LONG! Alternative manual cleaning steps;
I was unable to take Prof Kam’s Data Analytics Lab, but I know of his fervour and attention to detail. I believe in informed choices, and so I performed manual cleaning with the following steps: - Remove the rows with lowercase names, as most $LOC_DESC are in strict uppercase - Remove the rows with NA $LOC_DESC - Remove the row with NIL $LOC_DESC - For remaining rows, drop second entry - Retain remaining rows - After this, we just run the same steps on sg_bus_stop_sf or perform an anti-join.
However, after clarification with Prof Kam, we just drop the second entry. My code is shown below for extra gratification
show code
drop_second_stop = c("43709", "51071", "53041", "52059", "58031", "68091", "68099", "97079")
rows_to_retain_df <- repeated_df %>%
filter(
case_when(
BUS_STOP_N == "11009" & grepl("[a-z]", LOC_DESC) ~ FALSE,
BUS_STOP_N == "22501" & grepl("[a-z]", LOC_DESC) ~ FALSE,
BUS_STOP_N == "77329" & grepl("[a-z]", LOC_DESC) ~ FALSE,
BUS_STOP_N == "82221" & grepl("[a-z]", LOC_DESC) ~ FALSE,
BUS_STOP_N == "62251" & grepl("[a-z]", LOC_DESC) ~ FALSE,
BUS_STOP_N == "96319" & grepl("[a-z]", LOC_DESC) ~ FALSE,
BUS_STOP_N == "47201" & is.na(LOC_DESC) ~ FALSE,
BUS_STOP_N == "67421" & BUS_ROOF_N == "NIL" ~ FALSE,
BUS_STOP_N %in% drop_second_stop & RowNumber == 2 ~ FALSE,
TRUE ~ TRUE
)
)
rows_to_retain_df$LOC_DESC = toupper(rows_to_retain_df$LOC_DESC)
print("Printing rows to retain:")[1] "Printing rows to retain:"
show code
rows_to_retain_dfSimple feature collection with 16 features and 4 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 13488.02 ymin: 32604.36 xmax: 44144.57 ymax: 47934
Projected CRS: SVY21 / Singapore TM
# A tibble: 16 × 5
# Groups: BUS_STOP_N [16]
BUS_STOP_N BUS_ROOF_N LOC_DESC geometry RowNumber
* <chr> <chr> <chr> <POINT [m]> <int>
1 11009 B04-TMNL GHIM MOH TER (23100.58 32604.36) 2
2 22501 B02 BLK 662A (13488.02 35537.88) 2
3 43709 B06 BLK 644 (18963.42 36762.8) 1
4 47201 NIL W'LANDS NTH STN (22632.92 47934) 2
5 51071 B21 MACRITCHIE RESERVO… (28311.27 36036.92) 1
6 52059 B03 OPP BLK 65 (30770.3 34460.06) 1
7 53041 B05 UPP THOMSON ROAD (28105.8 37246.76) 1
8 58031 UNK OPP CANBERRA DR (27089.69 47570.9) 1
9 62251 B03 BEF BLK 471B (35500.36 39943.34) 2
10 67421 B01 CHENG LIM STN EXIT… (34548.54 42052.15) 1
11 68091 B01 AFT BAKER ST (32164.11 42695.98) 1
12 68099 B02 BEF BAKER ST (32154.9 42742.82) 1
13 77329 B01 BEF PASIR RIS ST 53 (40765.35 39452.18) 1
14 82221 B01 BLK 3A (35323.6 33257.05) 1
15 96319 NIL YUSEN LOGISTICS (42187.23 34995.78) 2
16 97079 B14 OPP ST. JOHN'S CRES (44144.57 38980.25) 1
- If we run the steps from above, we can see that there are no repeated bus stops.
cat("Total number of rows in sg_bus_stop_sf\t\t: ", paste0(length(sg_bus_stop_sf$BUS_STOP_N)))Total number of rows in sg_bus_stop_sf : 5140
cat("\nTotal unique bus stop IDs in sg_bus_stop_sf\t: ", paste0(length(unique(sg_bus_stop_sf$BUS_STOP_N))))
Total unique bus stop IDs in sg_bus_stop_sf : 5140
cat("\nRepeated bus stops\t\t\t\t: ", paste0(length(sg_bus_stop_sf$BUS_STOP_N) - length(unique(sg_bus_stop_sf$BUS_STOP_N))))
Repeated bus stops : 0
1.4 Generating Hex Maps
- We use
st_make_grid()withsquare = FALSEto create thehexagonlayer as defined in the study, which we nameraw_hex_grid- We pass
cellsize = 500to create the hexagons of necessary size. Prof Kam defined the apothem as 250m, which means the distance between opposite edges is double that, or 500m- I used
units::as_unitsto pass 500 metres into the argument. I am still uncertain whether a length of 500m needs to be reprojected, or whether we need to do any further transformation.
- I used
- We pass
- We use
st_sf()to convertraw_hex_gridinto ansfdataframemutate()is used here to create agrid_idcolumn- We just use
st_transform()in case we need to reproject the coordinate system, just in case.
- However, trying to visualize this right now just gives us a map full of hexagons.
show code
raw_hex_grid = st_make_grid(sg_bus_stop_sf, cellsize = units::as_units(500, "m"), what = "polygons", square = FALSE) %>%
st_transform(crs = 3414)
# To sf and add grid ID
raw_hex_grid <- st_sf(raw_hex_grid) %>%
# add grid ID
mutate(grid_id = 1:length(lengths(raw_hex_grid))) %>%
st_transform(crs = 3414)
tmap_mode("plot")tmap mode set to plotting
show code
qtm(raw_hex_grid)
- What we will need is to isolate only the hexes with bus stops in them.
- We use
lengths(st_intersects())to count the number of bus stops in each hex, and add that to a new column,$n_bus_stops - We then create
hexagon_sfby filtering for only hexes with non-zero bus stops in each.
- We use
- We then plot these using the usual
tmap()functions;tm_basemap()is used to create a “basemap” layer underneath to orient our hexes.- We used OneMapSG as our comprehensive map of Singapore; if you zoom in, you can actually count the number of bus stops in red on the map.
show code
# Count number of points in each grid, code snippet referenced from:
# https://gis.stackexchange.com/questions/323698/counting-points-in-polygons-with-sf-package-of-r
raw_hex_grid$n_bus_stops = lengths(st_intersects(raw_hex_grid, sg_bus_stop_sf))
# remove grid without value of 0 (i.e. no points inside that grid)
hexagon_sf = filter(raw_hex_grid, n_bus_stops > 0)
# head(hexagon_sf)
tmap_mode("view")tmap mode set to interactive viewing
show code
tm_basemap(providers$OneMapSG.Grey) +
tm_shape(hexagon_sf) +
tm_fill(
col = "n_bus_stops",
palette = "-plasma",
style = "cont",
breaks = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
title = "Number of bus_stops",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
popup.vars = c(
"Number of Bus Stops: " = "n_bus_stops"
),
popup.format = list(
n_bus_stops = list(format = "f", digits = 0)
)
) +
tm_borders(col = "grey40", lwd = 0.7)- There seem to be 2 regions of deep purple, centred over an area near, but not exactly over Pasir Ris and Choa Chu Kang MRTs.
- We perform some simple stats to count the total number of filtered hexes, and to see the maximum number of bus stops in a hex.
show code
cat(paste("Total number of raw hexes is\t\t\t: ", nrow(raw_hex_grid), "\n"))Total number of raw hexes is : 5040
show code
cat(paste("Total number of hexes (n_bus_stop > 1) is\t: ", nrow(hexagon_sf)), "\n")Total number of hexes (n_bus_stop > 1) is : 1520
show code
cat("\nPrinting map_hexagon_sf:\n >> ")
Printing map_hexagon_sf:
>>
show code
hexagon_sf[hexagon_sf$n_bus_stops > 10, ]Simple feature collection with 2 features and 2 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 17470.12 ymin: 38317.78 xmax: 42720.12 ymax: 40194.17
Projected CRS: SVY21 / Singapore TM
raw_hex_grid grid_id n_bus_stops
304 POLYGON ((17720.12 39616.82... 1584 11
1462 POLYGON ((42470.12 38317.78... 4355 11
0.3.1 Load Bus Trip Data
show code
# odbus_2308 <- read_csv("data/aspatial/origin_destination_bus_202308.csv")
#
# # Drop pt_type
# odbus_2308 <- select(odbus_2308, -PT_TYPE)
# # alternative way in read_csv df <- read_csv("data/aspatial/origin_destination_bus_202308.csv", col_types = "ccdcffd")
#
# odbus_2308$ORIGIN_PT_CODE <- as.factor(odbus_2308$ORIGIN_PT_CODE)
# odbus_2308$DESTINATION_PT_CODE <- as.factor(odbus_2308$DESTINATION_PT_CODE)
# glimpse(odbus_2308)- sanity check number of distinct bus stops over months
# odbus_2309 <- read_csv("data/aspatial/origin_destination_bus_202309.csv")
odbus_2310 <- read_csv("data/aspatial/origin_destination_bus_202310.csv")Rows: 5694297 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): YEAR_MONTH, DAY_TYPE, PT_TYPE, ORIGIN_PT_CODE, DESTINATION_PT_CODE
dbl (2): TIME_PER_HOUR, TOTAL_TRIPS
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Drop pt_type
# odbus_2309 <- select(odbus_2309, -PT_TYPE)
odbus_2310 <- select(odbus_2310, -PT_TYPE)
# alternative way in read_csv df <- read_csv("data/aspatial/origin_destination_bus_202308.csv", col_types = "ccdcffd")
# odbus_2309$ORIGIN_PT_CODE <- as.factor(odbus_2309$ORIGIN_PT_CODE)
odbus_2310$ORIGIN_PT_CODE <- as.factor(odbus_2310$ORIGIN_PT_CODE)
# cat("Confirm distinct origin bus stops 23-08: \n>> ", paste(length(unique(odbus_2308$ORIGIN_PT_CODE))))
# cat("Confirm distinct origin bus stops 23-09: \n>> ", paste(length(unique(odbus_2309$ORIGIN_PT_CODE))))
cat("Confirm distinct origin bus stops 23-10: \n>> ", paste(length(unique(odbus_2310$ORIGIN_PT_CODE))))Confirm distinct origin bus stops 23-10:
>> 5073
First we split out weekday and then perform mutate to group by
# odbus_2310_weekday = odbus_2310 %>%
# filter(DAY_TYPE == "WEEKDAY") %>%
# mutate(PEAK = case_when(
# TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 9 ~ "MORNING PEAK",
# TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 20 ~ "AFTEROON PEAK",
# TRUE ~ "Unknown"
# ))
#cat("Confirm $DAY_TYPE only has `WEEKDAY` value: \n>> ", paste(unique(odbus_2310_weekday$DAY_TYPE)))
#odbus_2310_weekday
odbus_filtered <- odbus_2310 %>%
mutate(PEAK = case_when(
DAY_TYPE == "WEEKDAY" & TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 9 ~ "WEEKDAY MORNING",
DAY_TYPE == "WEEKDAY" & TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 20 ~ "WEEKDAY AFTERNOON",
DAY_TYPE == "WEEKENDS/HOLIDAY" & TIME_PER_HOUR >= 11 & TIME_PER_HOUR <= 14 ~ "WEEKEND MORNING",
DAY_TYPE == "WEEKENDS/HOLIDAY" & TIME_PER_HOUR >= 16 & TIME_PER_HOUR <= 19 ~ "WEEKEND EVENING",
TRUE ~ "Unknown"
)) %>%
filter(
case_when(
PEAK == "Unknown" ~ FALSE,
TRUE ~ TRUE
)) %>%
group_by(ORIGIN_PT_CODE, PEAK) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))`summarise()` has grouped output by 'ORIGIN_PT_CODE'. You can override using
the `.groups` argument.
odbus_filtered <- odbus_filtered %>%
pivot_wider(names_from = PEAK, values_from = TRIPS, values_fill = 0)
write_rds(odbus_filtered, "data/rds/odbus_filtered.rds")
head(odbus_filtered)# A tibble: 6 × 5
# Groups: ORIGIN_PT_CODE [6]
ORIGIN_PT_CODE `WEEKDAY AFTERNOON` `WEEKDAY MORNING` `WEEKEND EVENING`
<fct> <dbl> <dbl> <dbl>
1 01012 8000 1770 3061
2 01013 7038 841 2770
3 01019 3398 1530 1685
4 01029 9089 2483 4063
5 01039 12095 2919 7263
6 01059 2212 1734 1118
# ℹ 1 more variable: `WEEKEND MORNING` <dbl>
Sanity check, number of distinct bus stops?
cat("Confirm distinct origin bus stops: \n>> ", paste(length(unique(odbus_2310$ORIGIN_PT_CODE))))Confirm distinct origin bus stops:
>> 5073
cat("\nConfirm distinct destination stops: \n>> ", paste(length(unique(odbus_2310$DESTINATION_PT_CODE ))))
Confirm distinct destination stops:
>> 5077
cat("\nConfirm distinct bus stops in `bus_stop_sf`: \n>> ", paste(length(unique(sg_bus_stop_sf$BUS_STOP_N))))
Confirm distinct bus stops in `bus_stop_sf`:
>> 5140
# odbus_2310_copy <- odbus_2310 %>%
# rename(BUS_STOP_N = ORIGIN_PT_CODE)
#
# non_hexagon_sf <- anti_join(bus_stop_sf, odbus_2310_copy, by = "BUS_STOP_N")
#
# cat("\nConfirm distinct bus stops in `non_hexagon_sf`: \n>> ", paste(length(unique(non_hexagon_sf$BUS_STOP_N ))))
#
# mapview_non_hexag = mapview(non_hexagon_sf, cex = 3, alpha = .5, popup = NULL)
# mapview_non_hexag
#
# distinct_odbus <- distinct(select(odbus_2310, ORIGIN_PT_CODE)) #5073
# distinct_busstop <- distinct(select(bus_stop_sf, BUS_STOP_N)%>%
# st_drop_geometry()) #5145
#
# anti_join(distinct_odbus, distinct_busstop, by=c("ORIGIN_PT_CODE" ="BUS_STOP_N")) # 60
# anti_join(distinct_busstop, distinct_odbus, by=c("BUS_STOP_N" = "ORIGIN_PT_CODE")) #132- Test to reproduce
01013bus stop with 841 trips on weekday mornings
show code
subset_df <- odbus_2310 %>%
filter(ORIGIN_PT_CODE == '01013') %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 6) %>%
filter(TIME_PER_HOUR <= 9) %>%
group_by(ORIGIN_PT_CODE, DAY_TYPE, TIME_PER_HOUR, YEAR_MONTH) %>%
summarise(TRIPS = sum(TOTAL_TRIPS), .groups = 'keep')
head(subset_df)# A tibble: 4 × 5
# Groups: ORIGIN_PT_CODE, DAY_TYPE, TIME_PER_HOUR, YEAR_MONTH [4]
ORIGIN_PT_CODE DAY_TYPE TIME_PER_HOUR YEAR_MONTH TRIPS
<fct> <chr> <dbl> <chr> <dbl>
1 01013 WEEKDAY 6 2023-10 180
2 01013 WEEKDAY 7 2023-10 138
3 01013 WEEKDAY 8 2023-10 254
4 01013 WEEKDAY 9 2023-10 269
show code
180 + 138 + 254 + 269[1] 841
Create bus_stop_hexgrid_id
- Combine hexagon_sf with bus_stop_sf
show code
bus_stop_hexgrid_id <- st_intersection(sg_bus_stop_sf, hexagon_sf) %>%
select(grid_id, BUS_STOP_N) %>%
st_drop_geometry()Warning: attribute variables are assumed to be spatially constant throughout
all geometries
show code
cat("\nNumber of bus stops\t:", length(unique(bus_stop_hexgrid_id$BUS_STOP_N)))
Number of bus stops : 5140
show code
cat("\nNumber of hexgrids\t:", length(unique(bus_stop_hexgrid_id$grid_id)))
Number of hexgrids : 1520
show code
head(bus_stop_hexgrid_id) grid_id BUS_STOP_N
3265 31 25059
2566 59 25751
254 90 26379
2893 115 25761
2823 117 25719
4199 117 26389
Combine bus_stop_hexgrid_id with trip details
- Combine bus_stop_hexgrid_id with odbus_filtered
show code
grid_trips_df <- left_join(odbus_filtered, bus_stop_hexgrid_id,
by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
select(grid_id,
`WEEKDAY MORNING`,
`WEEKDAY AFTERNOON`,
`WEEKEND MORNING`,
`WEEKEND EVENING`) %>%
group_by(grid_id) %>%
summarise(
WEEKDAY_MORNING_TRIPS = sum(`WEEKDAY MORNING`),
WEEKDAY_AFTERNOON_TRIPS = sum(`WEEKDAY AFTERNOON`),
WEEKEND_MORNING_TRIPS = sum(`WEEKEND MORNING`),
WEEKEND_EVENING_TRIPS = sum(`WEEKEND EVENING`)
)Adding missing grouping variables: `ORIGIN_PT_CODE`
show code
head(grid_trips_df)# A tibble: 6 × 5
grid_id WEEKDAY_MORNING_TRIPS WEEKDAY_AFTERNOON_TRIPS WEEKEND_MORNING_TRIPS
<int> <dbl> <dbl> <dbl>
1 31 103 390 0
2 59 52 114 26
3 90 78 291 52
4 115 185 1905 187
5 117 1116 2899 455
6 118 53 241 76
# ℹ 1 more variable: WEEKEND_EVENING_TRIPS <dbl>
Left join back to hexagon_df
show code
hexagon_sf_2 <- left_join(hexagon_sf, grid_trips_df,
by = 'grid_id' ) %>%
mutate(
WEEKDAY_MORNING_TRIPS = ifelse(is.na(WEEKDAY_MORNING_TRIPS), 0, WEEKDAY_MORNING_TRIPS),
WEEKDAY_AFTERNOON_TRIPS = ifelse(is.na(WEEKDAY_AFTERNOON_TRIPS), 0, WEEKDAY_AFTERNOON_TRIPS),
WEEKEND_MORNING_TRIPS = ifelse(is.na(WEEKEND_MORNING_TRIPS), 0, WEEKEND_MORNING_TRIPS),
WEEKEND_EVENING_TRIPS = ifelse(is.na(WEEKEND_EVENING_TRIPS), 0, WEEKEND_EVENING_TRIPS),
)
head(hexagon_sf_2)Simple feature collection with 6 features and 6 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 3720.122 ymin: 27925.48 xmax: 4970.122 ymax: 31533.92
Projected CRS: SVY21 / Singapore TM
grid_id n_bus_stops WEEKDAY_MORNING_TRIPS WEEKDAY_AFTERNOON_TRIPS
1 31 1 103 390
2 59 1 52 114
3 90 1 78 291
4 115 1 185 1905
5 117 2 1116 2899
6 118 1 53 241
WEEKEND_MORNING_TRIPS WEEKEND_EVENING_TRIPS raw_hex_grid
1 0 56 POLYGON ((3970.122 27925.48...
2 26 14 POLYGON ((4220.122 28358.49...
3 52 100 POLYGON ((4470.122 30523.55...
4 187 346 POLYGON ((4720.122 28358.49...
5 455 634 POLYGON ((4720.122 30090.54...
6 76 55 POLYGON ((4720.122 30956.57...
- Now to try to do data analysis
- In my defense, I emailed prof kam for DAL notes but he ignroed me for 6 months
show code
summary(hexagon_sf_2$WEEKDAY_MORNING_TRIPS) Min. 1st Qu. Median Mean 3rd Qu. Max.
0 919 7566 16812 23245 386450
show code
hist(hexagon_sf_2$WEEKDAY_MORNING_TRIPS,
main = "Histogram Example",
xlab = "WEEKDAY_MORNING_TRIPS",
col = "lightblue",
border = "black")
show code
# Load the ggplot2 package
library(ggplot2)
trip_col_names <- c("WEEKDAY_MORNING_TRIPS", "WEEKDAY_AFTERNOON_TRIPS", "WEEKEND_MORNING_TRIPS", "WEEKEND_EVENING_TRIPS")
par(mfrow = c(2, 2)) # Set up a 2x2 layout
custom_breaks <- seq(0, 550000, by = 50000)
for (col in trip_col_names) {
hist(hexagon_sf_2[[col]], main = col, col = "lightblue", border = "black",
breaks = custom_breaks)
}
show code
colnames(hexagon_sf_2)[1] "grid_id" "n_bus_stops"
[3] "WEEKDAY_MORNING_TRIPS" "WEEKDAY_AFTERNOON_TRIPS"
[5] "WEEKEND_MORNING_TRIPS" "WEEKEND_EVENING_TRIPS"
[7] "raw_hex_grid"
Map hexes on SIngapore
show code
sarb_sf <- st_read(dsn = "data/geospatial",
layer = "Subzone_Census2010") %>%
st_transform(crs = 3414)Reading layer `Subzone_Census2010' from data source
`C:\1darren\ISSS624\Take-home_Ex01\data\geospatial' using driver `ESRI Shapefile'
Simple feature collection with 311 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2637.869 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21
show code
tmap_options(check.and.fix = TRUE)
qtm(sarb_sf)Warning: The shape sarb_sf is invalid (after reprojection). See sf::st_is_valid
show code
## Additional data from: Data.gov.sg, https://beta.data.gov.sg/datasets/d_02cba6aeeed323b5f6c723527757c0bc/viewshow code
tm_shape(sarb_sf) +
tm_polygons() +
tm_shape(hexagon_sf_2) +
tm_fill("WEEKDAY_MORNING_TRIPS",
style = "quantile",
palette = "Blues",
title = "Number of trips")Warning: The shape sarb_sf is invalid (after reprojection). See sf::st_is_valid
show code
tm_shape(sarb_sf) +
tm_polygons() +
tm_shape(hexagon_sf_2) +
tm_fill("WEEKDAY_AFTERNOON_TRIPS",
style = "quantile",
palette = "Greens",
title = "Dependency ratio")Warning: The shape sarb_sf is invalid (after reprojection). See sf::st_is_valid
tm_shape(sarb_sf) +
tm_polygons() +
tm_shape(hexagon_sf_2) +
tm_fill("WEEKEND_MORNING_TRIPS",
style = "quantile",
palette = "Reds",
title = "Dependency ratio")Warning: The shape sarb_sf is invalid (after reprojection). See sf::st_is_valid
show code
tm_shape(sarb_sf) +
tm_polygons() +
tm_shape(hexagon_sf_2) +
tm_fill("WEEKEND_EVENING_TRIPS",
style = "quantile",
palette = "Oranges",
title = "Dependency ratio")Warning: The shape sarb_sf is invalid (after reprojection). See sf::st_is_valid
show code
# tm_shape(sarb_sf) +
# tm_polygons() +
# tm_shape(hexagon_sf_2) +
# tm_fill("WEEKDAY_MORNING_TRIPS",
# style = "cont",
# palette = "viridis",
# breaks = custom_breaks,
# title = "Dependency ratio")
# tm_shape(sarb_sf) +
# tm_polygons() +
# tm_shape(hexagon_sf_2) +
# tm_fill("WEEKDAY_AFTERNOON_TRIPS",
# style = "cont",
# palette = "viridis",
# breaks = custom_breaks,
# title = "Dependency ratio")
# tm_shape(sarb_sf) +
# tm_polygons() +
# tm_shape(hexagon_sf_2) +
# tm_fill("WEEKEND_MORNING_TRIPS",
# style = "cont",
# palette = "viridis",
# breaks = custom_breaks,
# title = "Dependency ratio")
# tm_shape(sarb_sf) +
# tm_polygons() +
# tm_shape(hexagon_sf_2) +
# tm_fill("WEEKEND_EVENING_TRIPS",
# style = "cont",
# palette = "viridis",
# breaks = custom_breaks,
# title = "Dependency ratio")Plot Weekday Morning
show code
# mapview(hexagon_sf_2, zcol="WEEKDAY_MORNING_TRIPS", cex = 3, alpha = .5, popup = NULL)Plot Weekday Afternoons
show code
# mapview(hexagon_sf_2, zcol="WEEKDAY_AFTERNOON_TRIPS", cex = 3, alpha = .5, popup = NULL)Plot Weekend Morning
show code
# mapview(hexagon_sf_2, zcol="WEEKEND_MORNING_TRIPS", cex = 3, alpha = .5, popup = NULL)Plot Weekend Afternoons
show code
# mapview(hexagon_sf_2, zcol="WEEKEND_EVENING_TRIPS", cex = 3, alpha = .5, popup = NULL)- Using both fixed- and adaptive-distance neigbours generates similar results; values are largely in agreement
- knn=8 feels like it’s not very logical, but maybe there’s not too much difference
2. Create LISA
Create Queen contiguity weight matrix hex
show code
wm_hex <- st_contiguity(hexagon_sf_2, queen=TRUE)
summary(wm_hex)Neighbour list object:
Number of regions: 1520
Number of nonzero links: 6874
Percentage nonzero weights: 0.2975242
Average number of links: 4.522368
9 regions with no links:
561 726 980 1047 1415 1505 1508 1512 1520
Link number distribution:
0 1 2 3 4 5 6
9 39 109 205 291 364 503
39 least connected regions:
1 7 22 38 98 169 187 195 211 218 258 259 264 267 287 454 562 607 642 696 708 732 751 784 869 1021 1022 1046 1086 1214 1464 1471 1482 1500 1501 1503 1506 1510 1519 with 1 link
503 most connected regions:
10 13 16 17 24 25 31 35 42 43 48 53 55 60 63 67 73 77 80 81 84 85 87 88 91 92 97 102 107 111 117 121 127 133 140 141 143 148 149 150 154 155 156 157 163 164 165 173 174 175 183 184 185 191 192 193 194 200 201 202 205 206 207 208 216 229 239 243 244 246 257 266 271 278 279 283 284 291 292 298 300 301 302 304 309 310 312 313 316 321 324 325 327 337 338 339 340 343 352 355 363 368 390 391 400 402 403 407 414 418 423 425 431 436 437 438 440 443 450 451 452 461 466 467 468 469 473 477 480 481 485 489 493 494 496 502 503 507 513 514 517 518 523 529 534 539 543 548 549 550 552 556 558 564 568 573 574 576 577 581 590 591 594 598 599 604 605 609 615 619 624 626 633 636 637 638 648 649 650 654 655 657 658 659 669 670 671 677 680 681 682 687 688 690 691 700 701 704 705 706 713 716 717 724 727 728 729 740 741 755 757 758 760 771 774 775 776 777 782 783 787 788 789 792 793 794 795 799 800 806 807 810 811 812 813 819 820 823 824 825 830 831 832 841 843 844 846 847 848 850 851 852 853 854 860 863 865 866 867 871 872 876 877 878 880 881 882 884 885 887 888 891 893 896 899 902 905 906 910 914 919 921 926 927 928 930 931 935 937 943 944 945 946 947 948 954 958 959 962 963 968 969 971 972 973 977 984 985 986 987 988 990 996 997 998 999 1004 1011 1012 1013 1014 1024 1025 1026 1028 1029 1036 1037 1038 1042 1050 1051 1054 1056 1057 1062 1063 1064 1066 1067 1068 1069 1076 1078 1079 1080 1083 1089 1093 1100 1101 1102 1105 1106 1110 1111 1117 1120 1121 1122 1128 1133 1134 1135 1136 1141 1142 1144 1145 1146 1147 1148 1150 1156 1157 1158 1162 1163 1164 1166 1169 1170 1171 1172 1176 1177 1178 1179 1180 1184 1186 1190 1191 1192 1193 1194 1201 1202 1203 1204 1205 1206 1207 1210 1211 1217 1218 1219 1220 1221 1227 1233 1234 1235 1239 1244 1245 1251 1253 1254 1255 1261 1265 1266 1271 1272 1273 1277 1281 1283 1289 1299 1301 1302 1303 1304 1316 1318 1324 1325 1326 1327 1329 1330 1331 1334 1335 1336 1337 1343 1344 1345 1352 1353 1355 1356 1361 1365 1366 1368 1369 1371 1372 1377 1380 1381 1382 1384 1388 1391 1393 1395 1398 1406 1408 1412 1417 1418 1420 1424 1425 1426 1427 1428 1433 1434 1435 1436 1440 1441 1442 1446 1447 1449 1451 1453 1456 1457 1459 1460 1461 1462 1469 with 6 links
Create Queen contiguity
show code
hex_no_nb <- c(307, 565, 730, 984, 1051, 1419, 1509, 1512, 1516, 1524)
hexagon_sf_2_drop_nonb <- hexagon_sf_2 %>%
mutate(RowNumber = row_number()) %>%
subset( !(RowNumber %in% hex_no_nb))
vis_excluded_hexes <- tm_shape(sarb_sf) +
tm_polygons() +
tm_shape(hexagon_sf_2) +
tm_fill("red") +
tm_shape(hexagon_sf_2_drop_nonb) +
tm_fill("green")
vis_excluded_hexesWarning: The shape sarb_sf is invalid (after reprojection). See sf::st_is_valid
Drop JB Bus Stop
Find Northernmost point: JB Bus stop
show code
# coordinates <- st_coordinates(raw_bus_stop_sf$geometry)
#
# # Find the index of the northernmost point
# northernmost_index <- which.max(coordinates[, 2])
#
# # Extract the northernmost point
# northernmost_point <- raw_bus_stop_sf[northernmost_index, ]
# northernmost_point # 46239Identify the hex to drop
show code
# jb_grid_id <- pull(bus_stop_hexgrid_id[bus_stop_hexgrid_id$BUS_STOP_N == 46239, 'grid_id']) #1767
# hexagon_sf_2_drop_jb <-
# subset(hexagon_sf_2, !(grid_id == jb_grid_id))Now we figure out distance; first find centroid
show code
# longitude <- map_dbl(hexagon_sf_2_drop_jb$raw_hex_grid, ~st_centroid(.x)[[1]])
# latitude <- map_dbl(hexagon_sf_2_drop_jb$raw_hex_grid, ~st_centroid(.x)[[2]])
# coords <- cbind(longitude, latitude)
# # cat("Printing first 6 rows of `coords`:\n")
# # head(coords)
#
#
# k1_nn_obj <- st_knn(coords, k = 1)
# k1dists <- unlist(st_nb_dists(coords, k1_nn_obj))
# summary(k1dists)- 2km is a huge distance – we would enmesh hexes to about 4 hexes away
- bus stops are
show code
# check_dist <- st_nb_dists(coords, k1_nn_obj)
# which.max(check_dist) # 1523
#
#
# tm_shape(sarb_sf) +
# tm_polygons() +
# tm_shape(hexagon_sf_2_drop_jb) +
# tm_fill("cyan") +
# tm_shape(hexagon_sf_2_drop_jb[1523,]) +
# tm_fill("orange")Identify the hex to drop
show code
# naval_base_grid_id <- pull(bus_stop_hexgrid_id[bus_stop_hexgrid_id$BUS_STOP_N == 96439, 'grid_id']) #1767
# hexagon_sf_2_drop_changi <-
# subset(hexagon_sf_2_drop_jb, !(grid_id == naval_base_grid_id))
#
# longitude_2 <- map_dbl(hexagon_sf_2_drop_changi$raw_hex_grid, ~st_centroid(.x)[[1]])
# latitude_2 <- map_dbl(hexagon_sf_2_drop_changi$raw_hex_grid, ~st_centroid(.x)[[2]])
# coords_2 <- cbind(longitude_2, latitude_2)
# k1_nn_obj_2 <- st_knn(coords_2, k = 1)
# k1dists_2 <- unlist(st_nb_dists(coords_2, k1_nn_obj_2))
# summary(k1dists_2)- A much more manageable 1km; two hexes away. Let’s work with this.
Now we figure out distance; first find centroid
show code
# hex_1km_nb <- st_dist_band(coords_2, lower = 0, upper = 1000.1)
# head(hex_1km_nb, 5)
#
# hex_1km_wt <- st_weights(hex_1km_nb)
#
# head(hex_1km_wt, 5)Calculate Local Moran’s I
show code
# localMI_day_morn <- local_moran(hexagon_sf_2_drop_changi$WEEKDAY_MORNING_TRIPS , hex_1km_nb, hex_1km_wt)
#
# hexagon_sf_local_moran <- cbind(hexagon_sf_2_drop_changi,localMI_day_morn) Identify the hex to drop
show code
# local_moran_stats <- tm_shape(sarb_sf) +
# tm_polygons() +
# tm_shape(hexagon_sf_local_moran) +
# tm_fill(col = "ii",
# style = "pretty",
# palette = "RdBu",
# title = "Local Moran Statistics") +
# tm_borders(alpha = 0.5) +
# tm_layout(main.title = "Local Moran Statistics")
#
# local_moran_p <- tm_shape(sarb_sf) +
# tm_polygons() +
# tm_shape(hexagon_sf_local_moran) +
# tm_fill(col = "p_ii",
# breaks=c(-Inf, 0.001, 0.01, 0.05, 0.1, Inf),
# palette="-Greens",
# title = "local Moran's I p-values") +
# tm_borders(alpha = 0.5) +
# tm_layout(main.title = "Local Moran's I p-values")
#
# tmap_arrange(local_moran_stats,
# local_moran_p,
# asp=1,
# ncol=2)- But we only want to see the ones with p < 0.05, so we need to do some cleaning
show code
# hexagon_sf_local_moran_sig_only <- hexagon_sf_local_moran %>%
# subset( !(p_ii < 0.05))
#
# tm_shape(sarb_sf) +
# tm_polygons() +
# tm_shape(hexagon_sf_local_moran) +
# tm_fill("darkgrey") +
# tm_shape(hexagon_sf_local_moran_sig_only) +
# tm_fill(col = "ii",
# style = "pretty",
# palette = "RdBu",
# title = "Local Moran Statistics") +
# tm_borders(alpha = 0.5) +
# tm_layout(main.title = "Local Moran Statistics")Moran Scatterplot
show code
# Create lag
# hexagon_sf_local_moran <- hexagon_sf_local_moran %>%
# mutate(trips_lag = st_lag(hexagon_sf_local_moran$WEEKDAY_MORNING_TRIPS, hex_1km_nb, hex_1km_wt))
#
# ggplot(hexagon_sf_local_moran, aes(WEEKDAY_MORNING_TRIPS, trips_lag)) +
# geom_point() +
# geom_vline(xintercept = mean(hexagon_sf_local_moran$WEEKDAY_MORNING_TRIPS), linetype = "dashed", color = "red", size = 1.2) +
# geom_hline(yintercept = mean(hexagon_sf_local_moran$trips_lag), linetype = "dashed", color = "red", size = 1.5) +
# labs(title = "Scatterplot Example", x = "X-axis", y = "Y-axis")Plot Local Moran’s
show code
# hexa_sig <- hexagon_sf_local_moran %>%
# filter(p_ii_sim < 0.05)
#
#
# tmap_mode("plot")
# tm_shape(sarb_sf) +
# tm_polygons() +
# tm_shape(hexagon_sf_local_moran) +
# tm_fill("mean") +
# tm_borders(alpha = 0.5)